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Abstract 

The Bethe-Salpeter equation (BSE) is a reliable model for estimating the absorption 
spectra in molecules and solids on the basis of accurate calculation of the excited states 
from first principles. This challenging task includes calculation of the BSE operator 
in terms of two-electron integrals tensor represented in molecular orbital basis, and 
introduces a complicated algebraic task of solving the arising large matrix eigenvalue 
problem. The direct diagonalization of the BSE matrix is practically intractable due to 
0(IV 6 ) complexity scaling in the size of the atomic orbitals basis set, N. In this paper, 
we present a new approach to the computation of Bethe-Salpeter excitation energies 
which can lead to relaxation of the numerical costs up to 0(N 3 ). The idea is twofold: 
first, the diagonal plus low-rank tensor approximations to the fully populated blocks 
in the BSE matrix is constructed, enabling easier partial eigenvalue solver for a large 
auxiliary system relying only on matrix-vector multiplications with rank-structured 
matrices. And second, a small subset of eigenfunctions from the auxiliary eigenvalue 
problem is selected to build the Galerkin projection of the exact BSE system onto the 
reduced basis set. We present numerical tests on BSE calculations for a number of 
molecules confirming the e-rank bounds for the blocks of BSE matrix. The numer¬ 
ics indicates that the reduced BSE eigenvalue problem with small matrices enables 
calculation of the lowest part of the excitation spectrum with sufficient accuracy. 

Key words: Bethe-Salpeter equation, Hartree-Fock equation, two-electron integrals, tensor 
decompositions, model reduction, reduced basis, truncated Cholesky factorization. 
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1 Introduction 

In modern material science there is a growing interest to the ab initio computation of absorp¬ 
tion spectra for molecules or surfaces of solids. Due to model limitations, the first principles 

*Max Planck Institute for Dynamics of Complex Systems, Magdeburg (benner@mpi-magdeburg.mpg.de) 

**Max Planck Institute for Mathematics in the Sciences, Leipzig; Max Planck Institute for Dynamics of 
Complex Systems, Magdeburg (vekh@mis.mpg.de). 

°Max Planck Institute for Mathematics in the Sciences, Inselstr. 22-26, D-04103 Leipzig, Germany 
(bokh@mis.mpg.de). 


1 



DFT or the Hartree-Fock calculations do not allow reliable estimates for excitation energies 
of molecular structures. One of the approaches providing means for calculation of the excited 
states in molecules and solids is based on the solution of the Bethe-Salpeter equation (BSE) 
[271 EH [251 EH [29]. The alternative methods treat the problem by using the time-dependent 
DFT or Green’s function approach [301 DDEE El Ell ESI '2SJ • The approximate coupled clus¬ 
ter calculations of electronic excitation energies by using rank decompositions have been 
described in Til . 

The BSE model, originating from high energy physics and incorporating the many-body 
perturbation theory and the Green’s function formalism, governs calculation of the excited 
states in a self-consistent way. The BSE approach leads to the challenging computational 
task on the solution of the eigenvalue problem for a large fully populated matrix, that is in 
general non-symmetric. It is worth to note that the size of BSE matrix scales quadratically in 
the size of the basis set, O(N^), used in ab initio electronic structure calculations. Hence the 
direct diagonalization is limited by 0(7V®) complexity making the problem computationally 
extensive already for moderate size molecules with the size of the atomic orbitals basis set, 
Nb rs 100. Furthermore, the numerical calculation of the matrix elements, based on the 
precomputed two-electron integrals (TEI) in the Hartree-Fock molecular orbitals basis, has 
the numerical cost that scales at least as O(N^). In the case of non-periodic lattice structured 
compounds (e.g. nano-structures) the number of basis functions increases proportionally to 
the lattice size that easily leads to intractable problems even for small lattices. Hence, a 
procedure that relies entirely on multiplications of a governing BSE matrix with vectors is 
the only viable approach. 

In this way, the numerical solution of the BSE eigenvalue problem introduces several 
computationally extensive sub-problems. The commonly used approaches for fast matrix 
computations are based on the use of certain specific data-sparse structures in the target 
matrix. Here, we do not consider the sparsity based on the truncation of small elements 
under the given threshold, since, in the particular case of BSE problem, it seems hardly 
implementable because of the highly non-regular sparsity pattern arising. 

In this paper we study the new approach to the solution of the BSE spectral problem 
based on the model reduction via the reduced basis which is determined by the eigenvec¬ 
tors of a simplified system matrix of low-rank plus diagonal structure. We investigate the 
approximation error of the reduced basis set depending on the rank truncation parameters. 
The theoretical and numerical analysis of the existence of the low-rank approximation and 
the respective rank bounds for different matrix blocks in the BSE matrix is presented. 

This approach includes the low-rank decomposition of the matrix blocks in the Bethe- 
Salpeter kernel using the Cholcsky factorization of the two-electron integrals (TEI) [16, IT5] 
represented in the Hartree-Fock molecular orbitals basis. The simplified block decomposition 
in the BSE system matrix is determined by the separation rank of order O(Nb), which enables 
compact storage and fast matrix-vector multiplications in the framework of iterations on a 
subspace for finding a small part of the spectrum. This opens the way to the rank-truncated 
arithmetics of the reduced complexity, O(N^), that may gainfully complement the existing 
numerical approaches to the challenging BSE model. 

Methods for solving partial eigenvalue problems for matrices with special sparse structure 
have been intensively studied in the literature, see for example, 0IH EQl El El EH and 
references therein. Brief surveys of commonly used rank-structured tensor formats can be 
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found in IT). [TSJ [B] and in references therein. 

The main computational tasks in the presented approach include the following steps: 

• Precompute the TEI tensor in the Hartree-Fock molecular orbital basis in the form of 
low-rank factorization. 

• Setting up matrix blocks in the BSE matrix that includes solution of the linear matrix 
equation with the identity plus low-rank governing matrix and low-rank right-hand 
side. 

• Compute the low-rank approximation to the selected sub-matrices in the BSE matrix 
subject to the chosen threshold criteria. 

• Construct the reduced basis set composed from eigenvectors corresponding to several 
lowest eigenstates of the rank-structured approximation to BSE matrix from the pre¬ 
vious step. 

• Project the initial BSE matrix onto the reduced basis and diagonalize the arising 
moderate size Galerkin matrix. 

• Select the essential part in the spectrum of the projected Galerkin matrix and build 
the predicted excitation energies and the respective eigenstates. 

The design of the efficient linear algebra algorithms for fast solution of arising large 
eigenvalue problems with rank-structured matrix blocks will be the topic for future work. 

The rest of the paper is organized as follows. Section [2] outlines the truncated Cholesky 
decomposition scheme for low-rank factorization of the two-electron integrals tensor in the 
Hartree-Fock molecular orbitals basis, that is the building block in the construction of the 
BSE matrix. Section |3] describes the algebraic computational scheme for evaluation of the 
entries in the BSE matrix, analyses low-rank structure in the different matrix blocks and 
describes the reduced basis approach. We also analyze numerically the error of the com¬ 
monly used simplified BSE model, the so-called Tamm-Dancoff (TDA) equation. The most 
numerically extensive part in computation of the BSE matrix blocks is reduced to finding 
the low-rank solution of the matrix equation with the diagonal plus low-rank structure in the 
governing matrix. Numerical tests indicate the convergence in the senior (lowest) excitation 
energies by increase of the separation ranks. Conclusion summarizes the main algorithmic 
and numerical features of the presented approach and outlines further prospects. 


2 Low-rank approximation of the two-electron inte¬ 
grals in Hartree-Fock calculus 

2.1 Cholesky decomposition of the TEI matrix 

The numerical treatment of the two-electron integrals (TEI) is the main bottleneck in the 
numerical solution of the Hartree-Fock equation and in DFT calculations for large molecules. 
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Given the atomic orbitals basis set {fj\<n<N h i G i/ 1 (M 3 ), and the associated two- 
electron integrals (TEI) tensor B = \b pv \ a \ (see (15.51) in Appendix), the associated N% x N% 
TEI matrix over the large index set Z x J, I = J = X b with X b := {1, N b }, 

B = mat{ B) = [b^. M \ e R N ? xN ? , 

is obtained by matrix unfolding of a tensor B = [b pu \ a \ . The TEI matrix B is proven to 
be symmetric and positive definite. The optimized Hartree-Fock calculations are based on 
the incomplete Cholesky decomposition [T| [321 12) El .26] [24], of the symmetric and positive 
definite matrix B , 

B^LL t , LeR N * xRB , R b = 0(N b ). (2.1) 

For this computation we apply the new Cholesky decomposition scheme, see [13 HE], where 
the adaptively chosen column vectors are calculated in the efficient way by using the pre¬ 
computed redundancy free factorization of the TEI matrix B (counterpart of the density 
fitting scheme). This allows the partial decoupling of the index sets {/Uzz} and {Act}. 

Notice that the Cholesky factorization (12.1|) can be written in the index form 

Rb 

E X fe ( / u; zz)X fc (cr; A), (2.2) 

k= 1 

where the second factor corresponds to the transposed matrix L R . Here L k = L fc (/i;z/), 
k = 1, ...,R B , denotes the N b x N h matrix unfolding of the vector L(:,k ) in the Cholesky 
factor L 6 R N b xRB . 

The results of various numerical experiments indicate that the truncated Cholesky decom¬ 
position with the separation rank 0(N b ) ensures the satisfactory numerical precision e > 0 
of order 10~ 5 - 10 -6 . The refined rank estimate 0(N b \ loge|) was observed in numerical 
experiments for every molecular system we considered so far [TEI HE] . 

In the standard quantum chemical implementations in the Gaussian-type atomic orbitals 
basis the numerically confirmed rank bound rank(B) < C B N b ( C B is about several ones) 
allows to reduce the complexity of building up the Fock matrix F to O(Nj^), which is by far 
dominated by computational cost for the exchange term K(D), see Appendix. 

2.2 Rank estimates for TEI matrix V and numerical illustrations 

Given the complete set of Hartree-Fock molecular orbitals { C p 6 i.e. the column 

vectors in the coefficients matrix C G R NbXNb , and the corresponding energies {£ p }, p = 
1,2 ,...,N b (see Appendix, where A* correspond to e p ). I 11 commonly used notations, {C,} 
and {C a } denote the occupied and virtual orbitals, respectively. 

For BSE calculations, one has to transform the TEI tensor B = [b lw \ a ] , corresponding to 
the initial AO basis set, to those represented in the molecular orbital (MO) basis, 


N b 

V • Viajb ^ ^ C'lJtiC'vaC\jCabbfu/,Xa , 4 bJ £ {1, ..., A^,}. (2.3) 
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The BSE calculations make use of the two subtensors of V specified by the index sets 
X 0 := {1, N or b } and X v := {N or b + 1, N b }, with N or b denoting the number of occupied 
orbitals (see Appendix). The first subtensor is defined as in the case of MP2 calculations, 

V . o, b G X v , z, j G X 0J (2.4) 

while the second one lives on the extended index set 


N . z 1 , s G X V j G X 0 . 


(2.5) 


In the following, we shall use the notation N v — Nb ■ 
Denote the associated matrix by V = [Vi a jb} £ 


N n . 




P N*xN? 


N ov — N or bN v . 

in case m, and similar by 


’ orb■) ^ ov ■L' orb 

D N ov x N c 


in case (12.51) . The straightforward computation of the matrix V by 


above representations makes the dominating impact to the overall numerical cost of order 
O(N^) in the evaluation of the block entries in the BSE matrix. The method of complexity 
O(Nf) based on the low-rank tensor decomposition of the matrix V was introduced in [15J 
(see §2.2). 

It can be shown that the rank Rb = 0(A),) approximation to the TEI matrix B pa LL T , 
with the N x Rb Cholesky factor L, allows to introduce the low-rank representation of the 
tensor V, and then reduce the asymptotic complexity of calculations to O(Nf), see [T5j. 
Indeed, let C m be the m-th column of the coefficient matrix C = {O m ,} G R NbXNb . Then 
substitution of (12. 2 j) to (12.3)) in case (12. 4 j) leads to 


Rb Ni 

Riajb E E C f j,iC 1/a CxjC (T bLk(n; v)L k {cr, A) (2.6) 

k= 1 /z,z/,A,(T=1 

Rb f N b \ / Nb 

= E S C*C va L k {p,v) Y, Cxfi ab L k {o- A) 

k =1 \ij.m= 1 / \A,cr=l 

Rb Rb 

= E( c? L k c a )(c£ Lie;) = E(cfnc«)(cfna) T . 

k =1 k =1 

Similar factorization can be derived in case (12.5)) . The precise formulation is given by the 
following lemma [15], which will be used in further considerations. 

Lemma 2.1 Let the rank-Rs Cholesky decomposition of the matrix B be given by ld.il) . 
then the matrix unfolding V = [vi a -jb\ allows a rank decomposition with rank(V ) < Rb- The 
Rs-term representation of the matrix unfolding V = [v ia -j b ) takes a form 

V — L v Ly, L v eR N - xRB , 


where the column vectors are given by 

Tyiff 1) A',. ir a N orbl k) h) L k C a , A 1,..., Rb , R ^ X t - . % G X Q . 

In case we have V = U V W$ G mth Uy e r n%xr b and Wy e r n^xr b _ 
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Representation (j2.6j) indicates that it is necessary to compute and store the only L v , Uy 
and Wy factors in the above rank-structured factorizations. 

Lemma 12.11 provides the upper bounds on rank(V) in the representation (12. 6 ft which 
might be larger than that obtained by the £-rank truncation. It can be shown that the e- 
rank of the matrix V remains of the same magnitude as those for the TEI matrix B obtained 
by its £-rank truncated Cholesky decomposition (see numerics in 1 13.2ft . 

Numerical tests in [1(7] indicate that the singular values of the TEI matrix B decay 
exponentially as 

a k < Ce~^ k , (2.7) 

where the constant z > 0 depends weakly on the molecule configuration. If we define Rb(e) 
as the minimal number satisfying the condition 

Rb 

E ^ 2 . 

k=R,B (s) - ! - ! 


then estimate (12.7ft leads to the £-rank bound R B (s) < CN b \ loge|, which will be postulated 
in the following discussion. 

Our goal is to justify that Ry{e) increases only logarithmically in e, similar to the bound 
for Rb(e). To that end we introduce the SVD decomposition of the matrix B, 

B = ud b u t , u e R n * xRb , d b e r R bxRb , 

which can be written in the index form 

Rb 

E cr k U k (ii; u)U k (a; A), (2.8) 

k =1 

with U k = [U k (fi;u)] E R N >> xN b and ||£ 4 ||f = 1 , k = 1,...,R B . 

Lemma 2.2 For given £ > 0, there exists a rank-r approximation V r to the matrix V such 
that r < R B {e) and 

\\V r — V\\ < CN b e\ loge|, 
where the constant C does not depend of e. 

Proof. We estimate the J?s(e)-term truncation error by using representation (12.8ft . 


Rb N b 

Riajb ^ ^ ^ ^ C^iCjy a C \j CabUk^fA., , A) 

k= 1 fi,u,X,a=l 

Rb / N b \ / Nb 

= E<* E CiiiCvoJJk (/^? I ( ^ ^ CxjCMa, A) 

k =1 1 / \A,cr= 1 

Rb Rb 

= J^a k (C?U k C a )(C R U R C J ) = J^a k (C?U k C a )(CjU k C b ) T , 

k =1 k =1 


(2.9) 
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Rb 

which can be presented in the matrix form V = Ofcf4V A T , where 14(i; a) = CjUkC a - By 

k =1 

Rb 

definition of Rb(s) we have cr| < e 2 . Hence, the error of rank-.Kg (e) approximation 

k=Rs (s)H-l 

Rb(s) 

defined by V r = ^kVk^k i can be bounded by 

k=\ 


Rb Rb Rb 

II E a tV t vI\\<( E <7 3 1/2 ( E IIUII 4 ) 172 (2.10) 

fc=-Rs(e)+l fc=iis(e)+l fc=i?s(e)+l 

Rb 

<4 E P4|| 4 ) 1/2 

fc=i? fl (e)+l 

<£( J R s - J R s (e))||C'|||J|C'||| u , 

taking into account that ||C4|| = 1, k = 1 ,..., Rb , and the Frobenius norm estimate 

null 2 = ||U(i;a)|& = \\cfu k cx < ||f/ t || 2 E IlCif IlCall 2 < E IIOItE IIC'oll 2 - 

i,a i a 

We suppose that R B = 0(N b \ loge|), then the multiple of e\ loge| in (12.101) does not depend 
on e, that proves our lemma. ■ 

The storage cost of these decompositions restricted to the active index set I v x I 0 amounts 
to Ry(£)N v N orb . The complexity of straightforward computation on the active index set can 
be estimated by 0 (RbN 2 N ov ). 




Figure 2.1: Singular values of V for several moderate size molecules. 

Figure l2~TI represents the singular values of the matrices V (red) and B (blue) for H 2 0, 
N 2 H 4 , and C 2 H 5 N0 2 (Glycine amino acid) molecules with the size of the basis set (N b , N orb ) 
equals to (41,9), (82,9), and (170,20), respectively. It can be seen that R.\/( £ ) is linear 
proportional to | loge| and it is of the same order of magnitude as Rb{z)- 

These calculations are based on the reduced truncated SVD algorithm applied to the 
initial Cholesky decomposition of the matrix V inherited from those for the TEI matrix B, 
see Lemma [2.21 and (12.91) . 
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3 Block tensor factorization in BSE system matrix 

Here we discuss the main ingredients of the computational scheme for calculation of blocks 
in the BSE matrix and their reduced rank approximate representation. 


3.1 Tensor representations using TEI matrix in MTO basis 

The construction of BSE matrix includes computation of several auxiliary quantities. First, 
introduce a fourth order diagonal ’’energy” matrix by 

Ae = [Ae ia j b \ £ R N ™ xNov : Ae iajjb = (e a - £i)S i:j S ab , 

that can be represented in the Kronecker product form 

Ae = / 0 0 diag{e a : a £ X v } - diag{e* : i £ X 0 } 0 I v , 

where I a and I v are the identity matrices on respective index sets. It is worth to note that 
if the so called homo lumo gap of the system is positive, i.e. 

£ a £% ^ d O 0, CL £ X v , X £ X Q: 


then the matrix Ae is invertible. 

Using the matrix Ae and the N ov x N ov two-electron integrals matrix V = [vi a jb] repre¬ 
sented in the MO basis as in (12. ‘ill , the dielectric function (N ov x N ov matrix) Z = [z pq ^s] is 
defined by 


Zpq,rs ■ dp r 5q S Vpq rs [%q (cU U)]rs,rs; 


with X 0 ( u; ) being the matrix form of the so-called Lehmann representation to the response 
function. In turn, the matrix representation of inverse in known to have a form 


XoV) 





implying 


Xo(0) 




Let 1 £ M JVo '' and d e = diag{ Ae” 1 } £ R No,J be the all-ones and diagonal vectors of Ae” 1 , 
respectively, specifying the rank-1 matrix 1 0 d £ . In this notations the matrix Z = [z pq ^f\ 
takes a compact form 

Z = I 0 0 I v + V 0 (1 • dj) , 

where © denotes the Hadamard product of matrices. Introducing the inverse matrix Z -1 , 
we finally, define the so-called static screened interaction matrix (tensor) by 


II [^p5,rs] • 'UJpq,rs ■ ^ ^ ^pq^tu^tu^rs- (3.1) 

t£l v ,u£X 0 

In the forthcoming calculations this equation should be considered on the conventional and 
extended index sets {p, s £ X D } U {q,r £ X v }, and {p, q £ X a } U {r, s £ X v }, respectively, 
corresponding to subtensors in (j2.4jl and in (12.5j) . 





Hence, on the conventional index set, we obtain the following matrix factorization of 

W := [Wia ijb ], 

W = Z~ X V provided that a, b G X v , i , j G X Q , 

where H is calculated by (12.4ft . while on the index set {p, q G X c } U {r, s 6l„} the modified 
matrix IH = is computed by (13.11) and (12.51) . 

Now the matrix representation of the Bethe-Salpeter equation in the ( ov,vo ) subspace 
reads as the following eigenvalue problem determining the excitation energies ui n : 


F 







where the matrix blocks are defined in the index notation by 


(3.2) 


dia,jb ■ ^£ia,jb d~ ^ia,jb U)ij,abi 


bia,jb ■ Vi a ,bj ^ib,ajt CL : b G X V: 2, j G X Q . 


In the matrix form we obtain 

A = Ae + V - W, 

where the matrix elements in W = [nkajfe] are defined by Wi a ,jb = Wi],ab-, computed by (13.11) 
and (12.5|) . Here the diagonal plus low-rank sparsity structure in Ae + V can be recognized 
in view of Lemma f2.ll For the matrix block B we have 


B = V -W = V -W, 

where the matrix V, corresponding to the partly transposed tensor, coincides with V, 

"V • [Viabj ] ; 

and W is dehned by permutation W = [wi a ,jb] = [ w ib.aj\■ I 11 th e following, we will investigate 
the reduced rank structure in the matrix blocks A and B resulted from the corresponding 
factorizations of V. 

Solutions of equation (13.2j) come in pairs: excitation energies ui n with eigenvectors 
(x n ,y n ), and de-excitation energies — u n with eigenvectors (x*,y*). 

The block structure in matrices A and B is inherited from the symmetry of the TEI 
matrix V, v w ,j^ = v* l bj and the matrix W, Wi a jb = u>bj,ai- I n particular, it is well known 
from the literature that the matrix A is Hermitian (since Via,jb = V*b, ia and W ijt ab = W* iM ) 
and the matrix B is symmetric (since v ia ^j = Vjb,ai and Wib, a j = w j a .bi)- 

In the following, we confine ourself by the case of the real spin orbitals, i.e. the matrices 
A and B remain real. It is known that for the real spin orbitals and if A + B and A — B 
are positive definite, the problem can be transformed into a half-size symmetric eigenvalue 
equation |6j. Indeed, in this case for every eigenpair we have, 


implying 


Hx + By = ojx, Bx + Ay = — uy, 


(■ A + H)(x + y) = u(x - y), (A - B)(x - y) = u(x + y). 
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Now, if A + B and A — B are both positive definite, then the previous equations transform 
to 

AIz = uj 2 z with M = (A-B) 1/2 (A +B)(A-B) 1/2 , (3.3) 

with respect to the normalized eigenvectors z = y/uj(A — i?) 1 / 2 (x + y). However, in this case 
the computation of the large fully populated matrix {A — B) 1 ' 2 may become the bottleneck. 

The dimension of the matrix in (13. 2 p is 2 N a N v x 2 N 0 N V , where N 0 and N v denote the 
number of occupied and virtual orbitals, respectively. In general, N a N v is asymptotically of 
the size 0(N b ), i.e. the spectral problem (j3.2[) becomes computationally extensive already 
for moderate size molecules with N b « 100. Indeed, the direct eigenvalue solver for (13.2j) (di- 
agonalization) becomes non-acceptable due to complexity scaling 0(N b ). Furthermore, the 
numerical calculation of the matrix elements, based on the precomputed TEI integrals from 
the Hartree-Fock equation, has the numerical cost that scales as 0(N b ) — 0(N b ) depending 
on how to compute the matrix W. Here again we propose to adapt the low-rank structure 
in the matrix V. 

The most challenging computations arise in the case of lattice structured compounds, 
where the number of basis functions increases proportionally to the lattice size L x L x L, 
i.e. N b ~ uoL 3 , that quickly leads to intractable problems even for small lattices. 

3.2 Eigenvalues in an interval by the low-rank approximation 

The large matrix size in equation (13. 2 p makes the solution of full eigenvalue problem com¬ 
putationally intractable even for moderate size molecules, not saying for lattice structured 
compounds. Hence, in realistic quantum chemical calculations of excitation energies the 
computation of several tens eigenpairs may be sufficient. Methods for solving partial eigen¬ 
value problems and matrix inversion for large matrices with special sparsity pattern have 
been intensively studied in the literature, see for example, [22], EH and references therein. 

3.2.1 The reduced basis approach by low-rank approximation 

In what following we show that the part Ae + V in the matrix block A has the diagonal 
plus low-rank (DPLR) structure, while the sub-matrix V in the block B exhibits the low- 
rank approximation. Taking into account these structures we propose the special partial 
eigenvalue problems solver based on the use of reduced basis set obtained from as the eigen¬ 
vectors of the reduced matrix that picks up only the essential part of the initial BSE matrix 
with the DPLR structure. The iterative solver is based on fast matrix-vectors multiplication 
and efficient storage of all data involved into the computational scheme. Using the reduced 
basis we than solve the initial problem by the Galerkin projection onto the reduced basis of 
moderate size. 

Another direction that includes the QTT analysis of the matrices involved in order to 
perform the fast matrix calculations in the QTT format will be considered elsewhere. 

We begin from the low-rank decomposition of the matrix V, 

V « L v Ly, L v e R JV « xflv , R v < Rb- 

where the rank parameter R v = Rv(e) = 0(N b | loge|) can be optimized depending on the 
truncation error e > 0 (see [IB] and H2.2[) . 
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First, we represent all matrix blocks and intermediate matrices included in the represen¬ 
tation of the BSE matrix by using the above decomposition and diagonal matrices as follows. 
The properties of the Hadamard product imply that the matrix Z exhibits the representation 

Z = I 0 ® I v + L v Ly 0 (l • dj) = I Nov + L V (L V © d £ ) T , 

where the rank of the second summand does not exceed Ry. Hence the matrix inversion Z~ 1 
within the calculation of Z~ l V can be computed by special algorithm applied to the DPLR 
structure. The alternative way to compute the product X = Z~ l V is the iterative solution 
of the matrix equation 

ZX = V, with V = L v Ly, (3.4) 

and with the DPLR matrix Z. The above matrix equation can be solved in a low-rank 
format by using preconditioned iteration with rank truncation. 

The computational cost for setting up the full BSE matrix F in (13. 2j) can be estimated by 
0{N‘o V ), which includes the cost 0(N ov Rb ) for generation of the matrix V and the dominating 
cost 0(N° v N or b) for setting up of W. 

In the following, we rewrite spectral problem (13.2(1 in the equivalent form 



The main idea of the reduced basis approach proposed in this paper is as follows. Instead 
of solving the partial eigenvalue problem for Ending of, say, mo eigenpairs in equation (13. 5 p , 
we, first, solve the slightly simplified auxiliary spectral problem with a modified matrix F 0 
obtained from F\ by low-rank approximation of W and W from the matrix blocks A and R, 
respectively, i.e. by transforms 

A i—>■ A 0 := A e + V — W r and B t-G R 0 := V — W r . (3.6) 

Here we assume that the matrix V is already presented in low-rank format, inherited from 
the Cholesky decomposition of TEI matrix. 

The modified auxiliary problem reads 



This eigenvalue problem is much simpler than those in (13.2(1 since now the matrix blocks Ho 
and Bq are composed by diagonal and low-rank matrices. 

Having at hand the set of m 0 eigenpairs computed for the modified (reduced model) 
problem (13.7(1 . = (A n , (u n , v n ) T )}, we solve the full eigenvalue problem for the 

reduced matrix obtained by projection of the initial equation onto the problem adapted 
small basis set {^n} of size m 0 . 

Define a matrix G\ = 1 : m 0 ) G ]gi 2N °v xrn o whose columns present the vectors of 

reduced basis, compute the Galerkin and mass matrices by projection onto the reduced 
basis specified by columns in Gi, 

M 1 = G%F 1 G 1 , Si = GfGi e R moXmo , 
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and then solve the projected generalized eigenvalue problem of small size m 0 x mo, 

M 1 Y = 'y n S 1 Y, YeM mo . (3.8) 

The portion of small eigenvalues 7 n , n — 1, ...,m 0 , is thought to be very close to the corre¬ 
sponding excitation energies u n , [n = 1 , in the initial spectral problem ( 13 . 2j) . Table 

13.11 illustrates that the large the size of reduced basis m 0 , the better the accuracy of the 
lowest excitation energy 71 . 


m 0 

5 

10 

20 

30 

40 

50 

h 2 0 

0.025 

0.025 

0.14 

0.01 

0.01 

0.005 

n 2 h 4 

0.02 

0.02 

0.015 

0.015 

0.015 

0.005 


Table 3.1: The error ^ — uq| vs. the size of reduced basis, m 0 . 


Remark 3.1 Notice that the matrix W might have rather large e-rank for small values 
of e which increases the cost of high accuracy solutions. The results of numerical tests 
that follow (see Table 1/0) indicate that the rank approximation to the matrix W with the 
moderate rank parameter allows for the numerical error in the excitation energies of the 
order of few percents. Further improvement of the accuracy requires noticeable increase in 
the computational costs. To avoid this numerical payoff, we apply another approximation 
strategy in which the matrix W remains unchanged, while matrices V and W are substituted 
by their low-rank approximation (see Fiaure FFlA) . 

Matrix blocks in the auxiliary equation (13. 7|) are obtained by rather rough e-rank ap¬ 
proximation to the initial system matrix. However, we observe much better approximations 
7 n from (13. 8p to the exact excitation energies a; n from the equation (13.2p . This can be ex¬ 
plained by the well known effect of the quadratic error behavior of eigenvalues with respect 
to the perturbation error in the symmetric matrix. In the situation with equation (13.31) the 
corresponding statement can be easily proved under mild assumptions. 

Lemma 3.2 Let matrices A and B be real and both A — B and A + B be symmetric, positive 
definite. Suppose that the matrices in the system (CO)) are perturbed by A i->- A A + A A, 
B 1 — y B := B + A B, such that ||AH|| < e and ||AR|| < e. Then the error in the excitation 
energies, A u n = u n — u n , is estimated by 

\Aiw n \ < Ce 2 , 

provided that | 2 uj n + Au n \ > 6 > 0 uniformly in e. 

Proof. Denote by A n = tw 2 and X n = u\ the exact and perturbed eigenvalues in the trans¬ 
formed problem (13.31) . This problem is symmetric, hence we have 

| a? n — uj^\ = | A n — A n | < Ce 2 , 
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implying 


C 2 

\^n k)n\ — 1 j ~ r£ > 

\(jJn ' ^n | 

which proves the statement. ■ 

In the particular BSE formulation based on the Hartree-Fock molecular orbitals basis, we 
have the slightly perturbed symmetry in the matrix blocks, i.e. Lemma 13.21 does not apply 
directly. However, we observe the same quadratic error decay in all numerical experiments 
implemented so far. 

3.2.2 Numerics for the reduced basis methods 

In this section we present numerical illustrations to the reduced basis approach for the BSE 
problem, which use the TEI tensor and molecular orbitals obtained from the solution of the 
Hartree-Fock equation by the 3D grid-based tensor-structured method mm- All examples 
below utilize the grid representation of the Galerkin basis functions from Gaussian basis sets 
of type cc-pDVZ. The two-electron integrals are computed in the form of low-rank Cholesky 
factorization by tensor-structured algorithms incorporating ID density fitting [T5]. 

In the following numerical tests we demonstrate on the examples of moderate size 
molecules that a small reduced basis set, obtained by separable approximation with the 
rank parameters of about several tens, allows to reveal several lowest excitation energies 
and respective excited states with the accuracy about O.leV - 0.02eV depending on the 
rank-truncation strategy. Table 13.21 represents the size of GTO basis set, N b , and the 
number of molecular orbitals, N orb , in numerical examples considered below. 



h 2 o 

h 2 o 2 

n 2 h 4 

C 2 H 5 OH 

c 2 h 5 no 2 

^orb, N b 

5, 41 

9, 68 

9, 82 

13, 123 

20, 170 


Table 3.2: GTO basis set size N b and number of molecular orbitals, N orb , in considered 
examples. 


Table lT3l demonstrates the quadratic decay of the error |yi — oj\ in the lowest excitation 
energy with respect to the approximation error to the initial BSE matrix, which is controlled 
by a^tolerance e in the rank truncation procedure applied to the BSE submatrices V, W 
and W. The resulting e-ranks for the corresponding matrices are presented for H 2 0, N 2 H 4 
and C 2 H 5 OH molecules. The error for 1st eigenvalue, |yi — aq|, is given in Hartree (one 
Hartree corresponds to 27eV). This table demonstrates that the error in the reduced basis 
approximation, |yi — cni|, is at least one order of magnitude smaller than those for simplified 
problem, |Ai — |, which motivates the use of the reduced basis equation (13.81) . 

This effect can be also seen in Figure l3Tl demonstrating the convergence —> ui n and 
A n —> u n with respect to the increasing rank parameter determining the auxiliary problem 
(the size of reduced basis is mo = 30). It confirms the numerical observation (see also Table 
13.31) that 

| r )n ^n | | An iO n |, 
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£ 

O 

1 

to 

O 

1 

10" 1 

O 

1 

to 

h 2 o 

71 - 


10" 2 

10" 2 

8 • 10 ~ a 

8 • 10" 6 

|Ai - uq 


0.35 

0.35 

0.31 

4.4 • 10 - 4 

F\ — F 0 


1.17 

1.17 

8 - 10" 1 

2 • 10" 2 

ranks V,W,W 

6 , 9, 6 

13, 13, 10 

25, 72, 36 

60, 180, 92 

N 2 H 4 

7i - wi 


1.4- 10 " 2 

1.5- 10 " 2 

10" 2 

6 • 10~ 6 

|Ai - cui 


0.25 

0.25 

0.25 

2.8 • 10~ 4 

||Fj — F 0 1 

1.8 

1.0 

6 • 10" 1 

1.4- 10 " 2 

ranks V,W,W 

11, 17, 11 

26, 25, 15 

49, 144, 54 

117, 657, 196 

c 2 h 5 oh 

7i - wi 


3• 10" 2 

3• 10" 2 

1.5- 10 ~ 2 

6 • 10" 6 

Ai -wi 


0.29 

0.29 

0.27 

3• 10" 4 

H-Fj — Fo|| 

3.0 

1.5 

7• 10" 1 

1.4- 10 " 2 

ranks V,W,W 

16, 17, 14 

39, 29, 20 

71, 105, 74 

171, 1430, 296 


Table 3.3: Accuracy for 1-st eigenvalue, |yi — 07 1, and norms of the difference between the 
exact and reduced-rank matrices, ||.Fi — F 0 ||, vs. e-rank for V, W and W. The BSE matrix 
size is given in brackets: H 2 O (360 x 360), N 2 H 4 (1430 x 1430), C 2 H 5 OH (2860 x 2860). 


N 2 H 4 , eps= 2* 10 -1 



N H eps=10 1 



N H ,eps=10 2 



Figure 3.1: Comparison of ttiq = 30 lower eigenvalues for the reduced and exact BSE systems vs. 
e in the case of N 2 H 4 molecule. 

that justifies the efficiency of the reduced basis approach. Three figures from the left to the 
right correspond to the rank truncation threshold £ G {2 • 10” 1 , 10 -1 , 10~ 2 }. The quantities 
\ n , 7 n and (jj n are marked by black, blue and red lines, respectively. Notice that for e = 10 ~ 2 
the energies (eigenvalues) for the initial and reduced systems are practically coinciding (error 
of the order of 10 -6 ), at the expense of large separation rank, see Table IHTil 

Figure IX^l represents similar data as in Figures IHTT1 but for amino-acid Glycine, C 2 H 5 NO 2 , 
with the BSE matrix size 6000 x 6000. In this case truncation threshold e = 2 • 10 _1 leads to 
the rank parameters Ry = 54 , R^y = 50, R ^ = 50, and the error for the minimal eigenvalue, 
uji = 0.2432 hartree, equals to 0.027 hartree. For e = 10 -1 we have the rank parameters 
Ry = 100 , Rjy = 215, R^y = 129, and the error for the minimal eigenvalue equals to 0.014 
hartree (0.38eV), while the choice e = 10 -2 again ensures the accuracy of the order of 10 -6 . 

Figure IHT51 left, demonstrates the lowest part of the BSE spectrum corresponding to the 
reduced EVP in the case of a water molecule, H 2 0, (rank truncation threshold e = 10” 1 ). 
The five lowest values of the computed excitation energy are approximately 8.7eV, 10.7eV, 
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Glycine, eps=2'10“’ 





Figure 3.2: Comparison of mo = 30 lower eigenvalues for the reduced and exact BSE systems vs. 
e in the case of Glycine amino acid. 


ll.leV, 12.8eV, and 14.5eV, see cf. dH. Figure 13.31 center and right, illustrates the BSE 
energy spectrum of the H 2 0 molecule (based on HF calculations with cc-pDVZ-48 GTO 
basis) for the lowest N re d = 30 eigenvalues vs. the rank truncation parameter e = 0.6 and 
0.1, where the ranks of V and the BSE matrix block W are 4, 5 and 28, 30, respectively, 
while the block W remains unchanged. For the choice e = 0.6 and £ = 0.1, the error in 
the 1st (lowest) eigenvalue for the solution of the problem in reduced basis is about O.lleV 
and 0.025eV, correspondingly. The CPU time in the laptop Matlab implementation of each 
example is about 5sec. 





Figure 3.3: Comparison of mo = 30 lower eigenvalues for the reduced and exact BSE systems for 
H 2 0 molecule: e = 0.6, center; e = 0.1, right. 


3.2.3 Comparison with the Tamm-Dancoff approximation 

It is interesting to compare the full BSE model with the so-called Tamm-Dancoff approxi¬ 
mation (TDA) [0J, which corresponds to setting the matrix B = 0 in equation (13.2[) . This 
simplifies the equation (13. 2 p to a standard Hermitian eigenvalue problem 

Axn = n n x n , x n G R Nov (3.9) 

with the reduced matrix size N ov . The reduced basis approach via low-rank approximation 
described in §3.2.11 can be applied directly to this equation. 

Below we present numerical tests indicating that the approximation error introduced 
by TDA method compared with the initial BSE system (13.2j) remains on the level of 0.003 
hartree for several compact molecules, see Figure [XU 
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Figure 3.4: Comparison between m o = 30 lower eigenvalues n n and u n for the TDA and full BSE 
models, respectively, on the examples of H 2 O, N 2 H 4 , and C 2 H 5 OH molecules; e = 0.1. 

Table l3~4l indicates a tendency to decrease the TDA model error for larger molecules. 



h 2 o 

H 2 0 2 

N 2 H 4 

C 2 H 5 OH 

Glycine 

err(hartree) 0.0031 

0.0051 

0.0022 

0.0024 

0.0017 


Table 3.4: The model error |/ii — cui| in TDA approximation for different molecules, £ = 0.1. 


4 Conclusions 

The new reduced basis method for solving the BSE equation based on the low-rank approxi¬ 
mation of matrix blocks was presented and analyzed. The potential efficiency of the approach 
is demonstrated numerically on the solution of large scale Bethe-Salpeter eigenvalue problem 
for some moderate size molecules and small amino-acid^]. The e-rank bounds for the re¬ 
quested sub-tensors of the TEI tensor, represented in the molecular orbitals (MO) basis set, 
were proven. We justify the quadratic error behavior in the excitation energies with respect 
to the accuracy of the rank approximation. Asymptotic estimates on the storage demands 
are provided. 

The basic computational scheme of the reduced basis method include: 

1. Precomputing step I: Given the set of Gaussian type orbitals, compute the related TEI 
tensor in the form of low-rank Cholesky decomposition. 

2. Precomputing step II: Calculate the MO basis set and related energy spectrum by 
solving the Hartree-Fock eigenvalue problem. 

3. Project the TEI tensor onto the MO basis set in the form of low-rank factorization. 

^Eor ab-initio electronic structure calculations we use the tensor-structured Hartree-Fock solver DUE] 
implemented in Matlab, employing the rank-structured calculation of the core Hamiltonian and TEI, using 
the discrete representation of basis functions on n x n x n 3D Cartesian grids. The arising 3D convolution 
integrals with the Newton kernel are replaced by algebraic operations in ID complexity. 
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4. Compute the diagonal plus low-rank approximations to the matrix blocks A and B 
and set up the auxiliary eigenvalue problem via rank-structured approximation to the 
BSE matrix. 

5. Select the reduced basis set from eigenvectors corresponding to several lowest eigen¬ 
states of the auxiliary structured eigenvalue problem. 

6. Compute the Galerkin projection of the exact BSE system matrix onto the reduced 
basis set and solve a small size reduced spectral problem by direct diagonalization. 

We demonstrate that the approximation error of the reduced basis method (1) - (6) can 
be reduced dramatically if the matrix block W remains unchanged. We also analyze the 
numerical error in the simplified BSE model, the so-called Tamm-Dancoff approximation 
(TDA), specified by the first diagonal matrix block A. 

The various numerical tests demonstrate that the reduced basis set obtained by solving 
the auxiliary eigenvalue problem based on the low-rank approximation to BSE matrix blocks 
(with the adaptively chosen rank parameter of the order of several tens) allows to achieve 
the sufficient accuracy for several lowest excited states. We justify numerically that the 
simplihed TDA equation is characterized by the model error of the order of 0.003 hartree 
(0.08 eV) for all molecular systems considered so far, with the tendency to decrease for large 
molecules, say 0.0017 hartree (0.045 eV) for Glycine amino acid. We note in closing that 
here we mainly focus on the numerical efficiency of the new computational scheme with 
respect to the accuracy vs. separation rank, tested on the Hartree-Fock-BSE and Hartree- 
Fock-BSE-TDA calculations for some moderate size molecules. 

The future work is concerned with the design of the efficient linear algebra algorithms 
for fast solution of arising large eigenvalue problems with diagonal plus rank-structured 
matrices. The approach can be also extended to the case of finite non-periodic lattice systems 
(e.g. quantum dots or nanoparticles) providing gainful opportunities for data-sparse matrix 
calculus. 

Another possible direction includes the quantized tensor approximation (QTT) [17] of 
the matrices involved in order to perform the super-fast matrix-vector calculations in the 
QTT tensor arithmetics with the e-rank truncation (see e.g. 0). 

Acknowledgements. The authors would like to acknowledge Prof. A. Savin (UPMC, 
Paris) and Prof. J. Toulouse (UPMC, Paris) for valuable comments on the problem setting 
for the BSE model and for providing the useful references. 

5 Appendix: The Hartree-Fock model 

The 2A' r or fe-electrons Hartree-Fock equation for pairwise L 2 -orthogonal electronic orbitals, 
i/ji : M 3 —y M, ijji E i7 1 (M 3 ), reads as 

Bi/jiix) = / 'ipi'ipjdx — Sij, i,j — 1,..., N orb , (5.1) 

J R 3 

with T being the nonlinear Fock operator 

+ V c + V H + JC. 
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Here the nuclear potential takes the form 


M 


V c (x) — — 77--— 77 , Z v >0, a u G M 3 , 

\\x — a- II 


v=\ 


while the Hartree potential Vh(x) and the nonlocal exchange operator K, read as 


Vh(x) := p-k 


p(y ) 

\x-y\\ 


dy, x G R 3 , 


and 


(/C^) (x) := - M x ) = 


i=1 


T{x,y) 

\\ x ~y\\ 


fp(y)dy, 


(5.2) 


(5.3) 


respectively. Conventionally, we use the definitions 

N 0 rb 


r(x,y) :=2^ ipi{x)^i{y), p(x) :=t(x,x), 


i=1 


for the density matrix r(x,y), and electron density p(x). 

Usually, the Hartree-Fock equation is approximated by the standard Galerkin projection 
of the initial problem (15.ip posed in H 1 ( R 3 ). For a given finite Galerkin basis set 
gn G i/ 1 (M 3 ), the occupied molecular orbitals ^ are represented (approximately) as 

N b 

"0 'i ^ ^ ^fiig^ii i I; N orb . (5.4) 

m=i 

To derive an equation for the unknown coefficients matrix C = {C^j} G ~R NbXNorb , first, we 
introduce the mass (overlap) matrix S = {Sfj,u}i<fj.,u<N b , given by 


S, 


fllS 


gtfudx, 


and the stiffness matrix H = {h^ u } of the core Hamiltonian J~L = —|A + V c (the single¬ 
electron integrals), 


hp» = - / V gp • Wg v dx + 
2 Jr s 


V c (x)g /J ,g I/ dx, 1 < /i, u < N b . 


The core Hamiltonian matrix H can be precomputed in 0(N b ) operations via grid-based 
approach. 

Given the finite basis set {gfj}i<n<N b , .9/, G H ^R 3 ), the associated fourth order two- 
electron integrals (TEI) tensor, B = [b lwX(T \, is dehned entrywise by 


b[iv\cr 



g li (x)g„(x)g x (y)ga(y) 


\x 


y II 


dxdy, p, is, A, a G {1,..., N b } =: X b . (5.5) 


In computational quantum chemistry the nonlinear terms representing the Galerkin ap¬ 
proximation to the Hartree and exchange operators are calculated traditionally by using the 
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low-rank Cholesky decomposition of the TEI tensor B = [b^ K x\ as defined in (15.51) . (12.1|1 
that initially has the computational and storage complexity of order 0(N£). 

Introducing the iV& x Nf, matrices J(D) and K(D), 

N b N b 

J{D ) M „ = b^ KX D KX , K(D) IU/ = ^ bfix^D ", A , (5.6) 

k,A=1 k,,A=1 

where D = 2 CC T G R NiXNb is the rank-A^ 6 symmetric density matrix, one then represents 
the complete Fock matrix F by 


F(D) — H + J(D) + K(D). (5.7) 

The resultant Galerkin system of nonlinear equations for the coefficients matrix C G 
M N b xN orb; anc [ the respective eigenvalues A, reads as 

F(D)C = SCA, A = diag(Xi, ..., X Nb ), (5.8) 

C T SC = I N , 

where the second equation represents the orthogonality constraints f R3 iljiifjjdx = Sij, and In 
denotes the Nb x Nb identity matrix. 
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